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Abstract 

We introduce a method for solving a self consistent electronic calculation within localized atomic 
orbitals, that allows us to converge to the complete basis set (CBS) limit in a stable, controlled, 
and systematic way. We compare our results with the ones obtained with a standard quantum 
chemistry package for the simple benzene molecule. We find perfect agreement for small basis set 
and show that, within our scheme, it is possible to work with a very large basis in an efficient and 
stable way. Therefore we can avoid to introduce any extrapolation to reach the CBS limit. 

In our study we have also carried out variational Monte Carlo (VMC) and lattice regularized 
diffusion Monte Carlo (LRDMC) with a standard many-body wave function (WF) defined by the 
product of a Slater determinant and a Jastrow factor. Once the Jastrow factor is optimized by 
keeping fixed the Slater determinant provided by our new scheme, we obtain a very good description 
of the atomization energy of the benzene molecule only when the basis of atomic orbitals is large 
enough and close to the CBS limit, yielding the lowest variational energies. 
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^Electronic address: sorella@sissa.it 



I. INTRODUCTION 



The use of localized basis sets is becoming more and more important in several electronic 
structure methods, because it allows a dramatic reduction of the dimension N of the single 
particle basis, i.e. much smaller than standard plane wave basis set dimensions. For instance 
in quantum chemistry calculations it is very difficult, if not impossible, to use post Hartree- 
Fock methods with extremely large basis sets, because their computational cost scales with a 
large power of N (N 4 — N 7 ). On the other hand all linear scaling methods jl-4| are based on 
a suitably localized basis and exploit the fact that the matrix elements of the Hamiltonian 
decay very rapidly with the distance among the orbital centers. Several other applications 
are known, and it is basically impossible to list all of them. 

Although our findings apply generally to all the above issues, in this paper we focus in 
particular on the use of a localized basis set for Monte Carlo optimization of wave functions 
because the number of variational parameters -e.g. in a Slater determinant- is proportional 
to N, and only with a localized basis it remains in a reasonable range even for large systems. 
Therefore, thanks to the recent advances in the optimization techniques in quantum Monte 
Carlo calculations (5 s| , it is possible nowadays to optimize a full many-body wave function 
with an affordable computational time (proportional to N 3 — iV 4 ). 

Despite the use of localized basis sets is becoming more and more popular there is an 
important issue on how to converge to the so called "complete basis set" (CBS) limit. In 
plane wave calculations this is usually achieved by systematically increasing the kinetic en- 
ergy cutoff so that a reliable and very well controlled convergence is possible. Unfortunately 
within a localized basis set framework, until now it is not possible to reach the same level 
of accuracy of the plane wave approach, because a too large value of N leads to redundancy 
of the basis and typically to numerical instabilities. On the other hand, working with a 
small basis far from the CBS limit, leads to the well known basis superposition error, that 
deteriorates the accuracy in the description of the chemical bond. 

The difficulty to work with a localized basis of N non-orthogonal orbitals 0j(r) (i = 
1, • • • , N) can be quantified by considering the N x N overlap matrix: 

Sij = (0,10,) = j d 3 rM7)Mrl- (1) 

In the following we assume that the atomic orbitals are normalized so that the diagonal 
elements of the above overlap matrix are identically equal to one. Since this is an overlap 



matrix it is also positive definite, namely all its eigenvalues Sj, henceforth assumed in as- 
cending order, are positive s« > 0, and a vanishing eigenvalue occurs only if the orbitals are 
linearly dependent. 



If the condition number of this positive definite matrix, namely the ratio s 
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between the largest (sjv < N, as the trace of S is ^ = Sj = iV) and the lowest 
eigenvalue si is below a certain threshold, calculations are very difficult and the convergence 
to the CBS limit is impossible to reach with standard methods. This problem has been 
often circumvented by extrapolation procedures and /or relying on cancellation errors (9 12]. 
However we believe that, at least for a simple self-consistent field (SCF) calculation based 
on the density functional theory (DFT) within the standard local density approximation 
(LDA), it is very important to converge to the CBS limit, essentially in the same way as in 
a plane wave calculation. 

Here we show that this important task can be achieved by applying a simple strategy for 
the diagonalization of the Hamiltonian, strategy that can be applied to any SCF calculation. 
Moreover we find that, even for the atomization energy of the simple benzene molecule, it is 
necessary to work close to the CBS limit in order to obtain a well converged result because, 
at least in this case, energy differences approach the CBS limit only slightly faster than the 
total energy. 

In this work, as already mentioned, we are interested to combine the Slater determinant 
obtained by an SCF calculation with a so called Jastrow factor, also expanded in a localized 
basis 0i(r). In this way an accurate many-body wave function is defined, that typically 
describes rather well the electron correlation, often much better than the original SCF 
calculation. Indeed in our QMC calculations, we have also found that it is very important 
to use a very large basis set for the DFT Kohn-Sham orbitals, because only in this case 
the corresponding Slater determinant is very close to the optimal one, namely the one that 
minimizes the energy in presence of the Jastrow factor. After that we obtain a very accurate 
atomization energy for this simple molecule, that is compatible with the experiments, despite 
our strong restriction of the variational ansatz. 

The paper is organized as follows. In Sec. II we describe the basis of localized atomic 
orbitals and the variational wave function used in this work, we briefly review the SCF 
LDA method and we show how to work with a large basis of localized atomic orbitals. In 
Sec. Ill we present our results for the benzene molecule, and finally in Sec. IV we draw our 
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conclusions. 



II. COMPUTATIONAL DETAILS 



A. Localized basis set 



In our DFT LDA-based calculations, we have used Slater (S) exchange and Perdew [X< 
(P) correlation functionals. The Kohn-Sham (KS) equations are solved by expanding the 
electronic orbitals in a Gaussian type orbital (GTO) basis set. Only four valence electrons 
are taken into account for the carbon atom. The Is core electrons are considered by atomic 

rydrogen atom in order to eliminate the 



pseudo-potentials(PPs), that are also used for the 
divergent electron-ion attraction at short distance 

In the test calculation presented here, we consider the benzene molecule with experimental 
carbon and hydrogen atomic positions. We define a localized basis set centered on each 
atom by using simple Gaussians exp(— Zir 2 ) with given angular momentum: s, p, d, f, g, .... 
In order to achieve convergence in a systematic way we define the standard even-tempered 
sequence for GTO exponents Zf. 

Zi = a? (2) 

for i — 0, • • -rii — 1 with a = Z min and a(3 n ' = Z max , where rii is the number of exponents 
used for the angular momentum I. The maximum number n of exponents is used only on 
the s-wave channel no = n, whereas for the higher angular momenta I = 1,2,3,4,... the 
number of exponents n\ is smaller and is given by following choice: 

m = no- 21 (3) 

Notice that the value of /3 is implicitly defined by the choice of Z max that will be discussed 
in the following. 

This basis is obviously complete as long as n and the maximum I (I < l max ) are systemat- 
ically increased until convergence is reached. The obvious advantage of our even-tempered 
GTO set is that the exponent sequence is determined by only two parameters, Z max and 
Zmin- Our purpose is indeed to show that a systematic convergence with n and l max < 4 
can be obtained by using a large but not prohibitive value of n. In practice for large n, the 
SCF energy is almost independent on Z max and Z min and therefore the choice of these two 
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parameters can be done by optimizing the DFT energy for few test cases, and by checking 
the stability of these two parameter values for large n. As a result we have verified that the 
simple choice Z min = 0.2 and Z max = 10 is nearly optimal for all n in the benzene case, as 
well as for the carbon atom. We have therefore adopted these two parameter values in all 
the forthcoming calculations. Probably by optimizing all exponents much faster convergence 
can be obtained but in this work we want to emphasize that it is possible to work with a 
large basis set, deliberately larger than necessary because not fully optimized, and obtain 
accurate and systematically converging results without limitations or constraints, likewise a 
plane-wave based approach. As a result we show that we can achieve this task by using n as 
large as < 30, that represents a very important restriction of the dimension N of the basis, 
roughly two or three orders of magnitude smaller compared to a plane wave based approach 
with the same accuracy in the total energy, e.g. an accuracy of O.lmH in the total energy 
of the benzene molecule can be achieved with iV ~ 1000 localized orbitals, whereas at least 
200 3 plane- waves are necessary for the same target. 

B. Description of the SCF method 

The DFT functional, within the LDA approximation, can be evaluated on a given set of 
atomic orbitals <pi{r), and is then defined by the overlap matrix S in Eq.(JT|) and the one 
body Hamiltonian matrix elements: 



where H contains Kinetic energy, electron-ion interaction, and the pseudopotentials used, 
whereas v h and v xc are the Hartree and the exchange and correlation potentials, respectively, 
both defined only by the total electronic density n(r). In this non-orthonormal basis the 
Kohn-Sham orbitals are given by: 



where the coefficients Xi.j can De obtained by solving the generalized eigenvalue problem: 




(4) 




(5) 
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The density is in turn defined by occupying the Kohn-Sham orbitals up to the Fermi energy 

n(f) = 2 ^*(rW) (7) 

E^^E p 

and self consistency is reached when the output density obtained after the diagonalization 
coincides within numerical accuracy with the input density used to evaluate vh and v xc . 

In recent years computer performances have substantially increased and the calculation of 
overlap and Hamiltonian matrices elements of the above types can be done rather efficiently 
by straightforward integration over a mesh, e.g. by replacing the integrals with appropriate 
summations over a set of electronic positions fj uniformly distributed in a finite volume 
V = L x x L y x L z , spanned by n x x n y x n z mesh points referred to the x,y,z Cartesian 
axes respectively, namely: 

S i,o = v ^2 < f ) i(ri)(f>j(ri) 
1 

Hi, = \vJ2 [m)(H KS Uri)) + {H KS Uri))Hri)] (8) 

where H KS = H lb +VH+v xc is the Kohn-Sham Hamiltonian and v = ^^"n* is the elementary 
box volume of the mesh grid. Notice that in the above discretization of the integrals we 
have symmetrized the Hamiltonian matrix elements, thus restoring the Hermitian property 
of the Hamiltonian (H it j = Hj ti ) even for a finite mesh. 

Further important improvements have been introduced to allow a fast and efficient con- 
vergence of these matrix elements by increasing the box volume and the size of the mesh 
(n x ,n y ,n z — > 00). For instance the origin of the mesh grid was chosen in a way to maxi- 
mize the minimum distance between the mesh and the ion positions. Moreover the Hartree 
potential in atomic units was calculated by the convolution: 

v H (rj) =vJ2 ~ rjHrj) + Cn(fj) = 1/V A ^ v c q n_ q (9) 

where n q and v c are the density and the Coulomb potential Fourier transform, respectively, 
whereas the finite constant C takes into account the infinite contribution of the Coulomb 
potential for 77 = rj, in a way that will be discussed later on. For open systems, the 
convolution was computed on a box of twice linear dimension with volume V A = 8V and 
with the origin f — (0, 0, 0) at the center of the box, namely n x — > 2n x , n y — > 2n y , n z — > 2n z 
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and \x\ < L x ,\y\ < L y , \z\ < L z . The convoluted density n{f) is assumed to vanish in 
V A outside the physical volume V ( < x < L x ,0 < y < L y ,0 < z < L z ) because the 
charge density is decaying exponentially fast at large distance from the atoms. Then it 
is defined periodic with period (2L x ,2L y ,2L z ), so that wavevectors q are correspondingly 
quantized q = (n/L x n, n/L y m, tt/LJ) with integers n, m, I, and the standard and extremely 
efficient convolution algorithm based on fast Fourier transform can be applied, with v c q = 
v/\rj\e~ iqrj + C. By using this enlarged box V , we can avoid to add fictitious 

rj eV A ,rj^0 

contributions to the Hartree potential, that would have raised from non vanishing replicas 
with shorter periodicity (L x , L y , L z ). It is also simple to show that in the limit n x , n y , n z — > 
oo, the Hartree potential coincides with the exact integral expression in the physical volume 
V, where we solve the eigenvalue equations in Eq.(j6]). Finally, in order to have an efficient 
extrapolation for n x ,n y ,n z — > oo at fixed volume, we have adopted a regularization for 
the Coulomb potential when 77 = rj for open systems. This is obtained by using an 
appropriate constant C, that is determined by imposing that the Hartree potential at r = 
coincides with the exact one for a Gaussian density n(r) = exp(— r 2 /2), namely vh{0) = 



j dr 3 l/r exp(— r 2 /2) = C + v V exp(— rf/2)/rj = Am in this case 15j. Obviously 

C — > for n x ,n y ,n z — > 00 and therefore this regularization is only meant to accelerate the 
convergence but does not change the limit values of all the quantities computed with this 
SCF method. 

All the above simple technical improvements allow us to have a very rapidly convergent 
calculation in n x ,L x — > 00 that is otherwise almost impossible, especially for open systems. 
In our application to the benzene molecule we have chosen a cubic box (n x = n y = n z and 
L x = L y = L z ) and the convergence vs the box L x — > 00 and mesh n x — > 00 size are displaied 
in Fig. CD 

In Tab. [U we show the same type of convergence also for the exchange and correlation 
energies. It is clear that the rapid convergence of our method applies also to correlation 
functions and also that high quality results can be obtained with a reasonable computational 
effort. 

The way we compute matrix elements on a grid is obviously much less efficient then 
the standard way to use tabulated Gaussian integrals [l6|. Nevertheless the calculation we 
propose is quite simple and straightforward, obviously at some expense of computer time, 
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FIG. 1: Convergence of the total DFT energy of the benzene molecule as function of the box size 
L x (left), in (Bohr), and the mesh size n x (right) using the 8s6p even-tempered basis set described 
in the text. 

but with the remarkable advantage that this method can be applied also to non-Gaussian 
orbitals, to periodic system, and that the accuracy of the calculation can be systematically 
controlled by increasing n x and the volume V . This is particularly important when the con- 
dition number s con d becomes extremely large, and, for an accurate calculation, both overlap 
and Hamiltonian matrix elements have to be calculated with much higher accuracy. More- 
over, for application to QMC computation, the time spent for a SCF calculation represents 
only a negligible amount, so that it is not really important that this part of the calculation 
is fully optimized. The most important problem we have solved in this work is the stabiliza- 
tion of the diagonalization routine for arbitrary large basis set and given (namely allowed by 
double precision arithmetic) accuracy in the calculation. This technique will be described 
in details in the next subsection. 



C. Stable diagonalization routine 

In this part we describe our way to improve the accuracy of the SCF algorithm by 
means of a more stable numerical solution of the generalized eigenvalue equations given in 
Eqj6l Basically the task is to compute in a stable way the eigenvectors of the Hamiltonian 
defined in a non-orthogonal basis of large linear dimension N. The problem is that the 
overlap matrix S in Eq.flT]) may have a very large condition number and a straightforward 
diagonalization leads to inaccurate and often dirty eigenvectors. In parallel computation 



S 



there is also the further complication that it is difficult to preserve orthogonality (and 
accuracy) of eigenvectors without having a huge buffer memory at disposal, as no efficient 
memory distribution is possible for the orthogonalization procedure. Here we describe the 
algorithm of our diagonalization routine, that solves in a very simple and efficient way all 
the above computational issues: 

• We apply first a diagonalization algorithm to the overlap matrix, based on the House- 
holder tridiagonalization and iterative Givens transform to the resulting tridiagonal 
matrix. All these transformations are unitary and should preserve orthogonality of 
eigenvectors uj for i,j = 1, • • • , N in infinite precision arithmetic. However this is not 
the case in practice since the matrix can be very ill-conditioned. Therefore, in order 
to improve the stability of the calculation we follow standard procedures in numerical 
linear algebra [17j. We disregard eigenvectors with small eigenvalues Sj compared to 
the maximum one < N, namely satisfying Si/sjy < e mac h, where e mac h is an input 
parameter, whose minimum value is around the relative machine precision (~ 10~ 16 



in double precision arithmetic) [2 1|. Neglecting small eigenvectors corresponding to 
small eigenvalues of the overlap matrix is justified from the fact that an eigenvector 
of the overlap matrix with nearly zero eigenvalue corresponds to a linear combination 
of normalized orbitals ^2jVj<pj(r), satisfying J^. |t>*| 2 = 1, that has almost vanishing 
norm equal to ^fsl, i.e. the non-orthogonal basis of normalized orbitals is redundant 
and this direction can be safely eliminated within an error -J~sl. Thus after neglect- 
ing all these singular directions we obtain a simple bound for the numerical error e acc 
expected when neglecting all these redundant directions: 

e acc < Mini (y^ | Si/s N > e mach ) < yfe mach s N < e mach N, (10) 

where in the last inequality we have used that sn < N, as shown before. 

In this step we do not require that the diagonalization has produced really orthog- 
onal eigenvectors but only that the eigenvalues have the right order of magnitude, 
the normalization of eigenvectors is correct, and that the diagonalization routine has 
produced linearly independent eigenvectors, properties that are easily satisfied even 
for extremely singular overlap matrices. Then we define non-singular directions: 

e} = —j=v) for Si/s N > e mach (11) 

f Si 



In this basis, the overlap matrix Sij = (ei\S\ej) should be equal to the identity 
in infinite precision arithmetic, namely a matrix with minimum condition number 
Scond = 1- Thus it turns out that, in finite precision arithmetic, by recomputing 
§ij = t el.Skjej we obtain a well-conditioned matrix that can be diagonalized again 
with high accuracy: 

hj = J2 Sk Vitf (12) 

k 

and therefore now, analogously to the previous case, we can define directions 




that remain safely saorthonormal even in finite precision arithmetic. Finally we store 
the global transformation from the original basis to the new one 

Uij = J2 e H ( 13 ) 

k 

In this basis the generalized eigenvalue equation ([6]) turns in a standard diagonaliza- 
tion of a very well-conditioned Hamiltonian matrix H = UHU^ because its spectrum 
corresponds to the physical spectrum of the original Hamiltonian. 

• At each iteration we recompute the Hamiltonian matrix H in this new basis. 

• Then apply again our diagonalization routine. 

• Go back to the original basis W- = ^ k Uj tk Wf, for the eigenvectors Wj {WD of H (H), 
computed at each iteration to implement self consistency and write the final molecular 
orbitals in the localized atomic basis set. 

The above scheme allows us to obtain a total energy accuracy that, at least in the 
examples studied in this work, is below O.lmH, namely one order of magnitude smaller 
than the typical target chemical accuracy. This is achieved in an automatic way, by using 
e mac h — 10~ 16 . Namely, even when the basis used is extremely redundant for iV — > oo, 
the simple algorithm, that we have previously described, allows us to obtain accurate total 
energies. This solution is general in the following sense: with the present algorithm it is 
in principle possible to work with arbitrarily large value of the basis dimension N, namely 
even with an over-complete basis set, and, after removing redundant directions as described 
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FIG. 2: Total DFT energy of benzene molecule in n = 20 case as a function of our regularization 
cutoff e mach . 

in the first step of the algorithm, we can work with a well-conditioned orthonormal basis, 
obtained after the second diagonalization, and that provides for large N converged results 
within the numerical accuracy e acc possible with the available numerical precision. Since, 
as shown before in Eq. (fT0|) . e acc increases very slowly with N, namely at most as oc V~N, 
we do not expect accuracy problems with basis sets much larger than the ones used in the 
examples presented here, even within double precision arithmetic. 

As shown in Fig. [21 the accuracy can be also controlled by decreasing the value of e mac h, 
a lower value leads to a more accurate calculation, until one reaches a threshold below 
the relative machine precision when instabilities occurs in the diagonalization because the 
numerical accuracy of finite precision arithmetic is simply not enough. Better accuracies 
could be in principle possible only with more accurate operations, e.g. by using quadruple 
precision. Nevertheless, as anticipated, this algorithm provides a reasonable accuracy in the 
total energy with standard, and usually much more efficient, double precision arithmetic. 

D. Correlated variational wave function for QMC calculations 

In this part we describe our WF which has been used in QMC calculations. The usual 
trial wave function (WF) used in QMC calculation is the product of an antisymmetric part 
and a Jastrow factor. The antisymmetric part is a single Slater determinant, while the 
Jastrow factor is a bosonic many body function which accounts for the dynamic correlations 
in the system. Our Slater determinant is obtained with N e i/2 (N e i is the total number 
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TABLE I: Total, exchange and correlation energies of DFT calculation of the benzene molecule 
using 24s22pl0d6/ basis set. A cubic box (L x = L y = L z ) is used. The given values of energies 
and box sizes are in (H) and (Bohr), respectively. 



n x 


L x 


Total energy 


Exchange energy 


Correlation energy 


160 


16 


-37.5404624 


-10.6865190 


-1.5922693 


200 


20 


-37.5400762 


-10.6867874 


-1.5922738 


240 


24 


-37.5400821 


-10.6867701 


-1.5922721 


400 


20 


-37.5400657 


-10.6867885 


-1.5922732 



of electrons in the system) doubly occupied molecular orbitals ipj{r), expanded in atomic 
orbitals as described in Eq.(j5]). The molecular orbitals are obtained from the self consistent 
DFT-LDA calculations explained in the previous section. The Jastrow factor takes into 
account the electronic correlation between two electrons and is conventionally split into an 
homogeneous interaction J 2 depending on the relative distance between two electrons, and 
a non homogeneous contribution depending on the positions of one or two atoms, J3 and 
J4 respectively. It also contains a one particle term Ji, that is important to compensate 
the change in the one particle density induced by J 2 , J3 and J4, as well as to satisfy the 
electron-ion cusp conditions. The one- and two-body terms J\ and J 2 are defined by the 
following equations: 



j x = ex P [J2 -vz a fi\(z)i\ i0 ) + J2 gixim , (14) 

ia ial 

and 

J2 = exp[^u(r#)], (15) 

i<j 

where i,j are indices running over the electrons, and I runs over different single particle 
orbitals xii centered on the atomic center a. ri a and denote electron-ion and electron- 
electron distances respectively. The corresponding cusp conditions are fixed by the function 
u(r) = F[l — exp(— r/F)]/2 (see e.g. Ref. jl8|). gf and F are optimizable variational 
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parameters. The three and four-body Jastrow J3J4 are given by: 

J 3 J 4 (^) = exp(^/(f i ,^)), (16) 

with f(fi,fj), being a two-electron coordinate function that can be expanded into the same 
single-particle basis used for J\\ 

f(fi,?j) = Y,9tx J al {n)xL{r 3 ), (17) 

ablm 

with gf^ optimizable parameters. Three-body (electron ion electron) correlations are de- 
scribed by the diagonal matrix elements g aa , whereas four-body correlations (electron ion 
electron ion) are described by matrix elements with a 7^ b. 

The exhaustive and complete expression of the Jastrow factor J{R) = 
Ji{R)J2{R)J^{R)Ja{R) that we adopt in this work allows us to take into account not 
only weak electron-electron interactions, but it is also extremely effective for suppressing 
higher energy configurations occurring when electrons are too close. 



III. RESULTS AND DISCUSSION 



In this section we show the remarkable convergence and stability properties of our method 
for the calculation of the total and atomization energies of the benzene molecule in the CBS 
limit. To this purpose we consider an atomic basis with l max < 1, namely with only s and 
p type of orbitals allowed, and show that it is possible to converge to the n — > 00 case 
even when, for large n, too many orbitals of the same angular momentum become highly 
redundant and are difficult to treat with standard methods. 

In Fig. [3] we compared our total DFT energies of the carbon atom and the benzene 



molecule with the ones obtained with the GAUSSIAN09 package [16J. We have used both 
for GAUSSIAN09 and our DFT algorithm exactly the same basis sets and pseudopotentials, 
treated with maximum accuracy in the angular integration, and therefore with negligible 
error, as well as the same Slater exchange and correlation functional with the standard 



Perdew-Zunger parameterization 
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For the small basis sets (n s = 6, 8) we are in excellent agreement with GAUSSIAN09. 
However, with large basis sets (n s > 8) there is a clear difference between our results and 
the GAUSSIAN09 ones. We do not know exactly what is the reason of this discrepancy. We 
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FIG. 3: Upper panels: convergence of the total DFT energy of the carbon atom (left) and the 
benzene molecule (right) using an atomic basis set containing only s and p angular momenta. 
Results obtained by GAUSSIAN09 are also shown for comparison. Lower panel, Logarithm of the 
condition number s con d of the overlap matrix for the carbon atom (left) and the benzene molecule 
(right). 

TABLE II: Total DFT energy for the benzene molecule as a function of the highest angular momen- 
tum of the atomic basis. Calculations were done with pseudopotentials [l^j] both for the hydrogen 
and carbon atoms at the experimental equilibrium positions, while for the hydrogen we have used 
a 3s2p basis. 



L 


Basis 


C-composition 


number of primitive Gaussian 


Energy (H) 


1 


sp limit 


24s22p 


594 


-37.4952567 


2 


spd limit 


24s22pl0d 


894 


-37.5357514 


3 


spdf limit 


24s22pl0d6f 


1146 


-37.5400821 


4 


spdfg limit 


24s22pl0d6f2g 


1254 


-37.5416815 
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just report that, as it is shown in the lower panels of Fig. [3j the condition number s con d, is 
increasing quite rapidly with the dimension N of the basis set, and the discrepancy between 
our results and GAUSSIAN09 is evident when the condition number becomes larger than 
~ 10 8 . Moreover since DFT is a variational method, it should be clear that for a given basis 
set the method that provides the lowest value of the functional should be the most accurate, 
provided the value of the functional can be calculated accurately. Indeed we have verified 
that a large condition number s con d ~ 10 17 affects only the self consistent step (in our case the 
diagonalization in Eqj6]) but allows to compute the value of the functional without particular 



problems 



221 ] . Therefore we can safely state that our Kohn-Sham molecular orbitals are very 



well converged, whereas standard methods suffers to work already with condition number 
larger than 10 8 . 

As a matter of fact our DFT energy for the benzene molecule converges to a very good 
value -37.4951453(#), (e.g. the GAUSSIAN09 result in the same basis is -37.4886324(#)). 
The quality of our very well converged molecular orbitals is also evident when we compute 
the expectation value of the energy of the corresponding Slater determinant by using the 
standard variational Monte Carlo technique to compute the energy expectation values (see 
Fig. H]). Though it is clearly inefficient to use VMC to compute energy expectation values of 
uncorrelated wave function, it is useful to use this method in this case because it does not 
require the explicit evaluation of the matrix S with very large condition number. 

Moreover the study of the n = 20 case as a function of our regularization cutoff e mac h, 
displaied in Fig. [2j clearly shows that we can reach a sufficient accuracy (~ O.lmH) in the 
total energy even in the large basis set limit, and that therefore the results for n = 24 should 
be considered well converged, namely close to the n — > oo, l max = 1 case. 

It is also particularly interesting to show how the atomization energy converges in this 
case. In Fig. [5] our results clearly indicate that the so called basis superposition error is very 
important in this case and monotonically disappears only for large n. 

Finally we study the convergence of the total energy as a function of the maximum 
angular momentum l max of the atomic basis. DFT energies are shown in Tab. HI1 Though 
the difference between l max = 4 (spdfg) and l max = 3 (spdf) is larger than lmH, this table 
shows that the convergence with l max is quite rapid, as each time l max is increased by a unit 
the accuracy improves by more than a factor 3, and therefore for l max = 4 we should be very 
close to the CBS limit, well within lmH accuracy in the total energy. 
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FIG. 4: VMC computation of the total energy E vmc of the Slater determinant obtained with Kohn- 
Sham molecular orbitals. For comparison we show also the results obtained with GAUSSIAN09. 




5 10 15 20 25 

n 



FIG. 5: DFT atomization energy of the benzene molecule calculated by using a basis set containing 
only s and p atomic orbitals (l max = 1 limit). Results obtained by GAUSSIAN09 are also shown 
for comparison. 

In the following we show the importance of being close to the CBS limit in QMC calcu- 
lations obtained either by optimizing the Jastrow factor over the Slater determinant defined 
by the DFT Kohn-Sham molecular orbitals or by full optimizing both the Jastrow and the 
molecular orbitals with the method described in Ref. [5[, starting from the former initial 
wave function. We indicate in the following the first wave function by J — DFT — WF, 
whereas the latter one will be denoted by J — OPT — WF. 

A 4s3p2d (lslp) uncontracted Gaussian basis set was used for expanding the Jastrow 
factor around each carbon (hydrogen) atom. Fig. E] shows the convergence of VMC total 
energy of the benzene molecule with J — DFT — WF, which an atomic basis used for the 
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-37.596 




n 



FIG. 6: VMC total energy using the J—DFT—WF, denned in the text, for the benzene molecule as 
a function of the number of s and p orbitals. We have used DFT molecular orbital as a determinant 
part of trial wave function and only the Jastrow factor was optimized. 

TABLE III: VMC total energy (H) of benzene molecule by using the J — DFT — WF defined in 
the text. 



/ 


Basis 


C-composition 


Energy (H) 


1 


sp limit 


24s22p 


-37.6111(4) 


2 


spd limit 


24s22pl0d 


-37.6386(4) 


3 


spdf limit 


24s22pl0d6f 


-37.6478(4) 


4 


spdfg limit 


24s22pl0d6f2g 


-37.6480(4) 



Slater determinant with l max = 1. This picture shows that the presence of the Jastrow 
factor improves the convergence to the n — > oo limit. For instance the DFT total energy 
converges when the basis contains more than 24s and 22p. Instead, by using the Jastrow 
factor, clear convergence is reached already for a 16sl4p basis set. For fixed maximum 
angular momentum of the atomic basis l max in the Slater determinant, we have carried out 
VMC calculations of the total energy of the benzene molecule. As shown in Tab. II I II the 
total energy difference between the l max = 1 and the l max = 2 cases is about 0.75(eV), while 
this difference for l max = 2 and l max = 3 shrinks to 0.25(eV). Moreover our VMC results 
suggest that it is not necessary to add g orbital to the basis set, for an accuracy smaller 
than 0.005(eV). 
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TABLE IV: VMC atomization energy of the benzene molecule obtained with the J — DFT — WF 
defined in the text. 



/ 


Basis 


C-composition 


Binding Energy (eV) 


1 


sp limit 


24s22p 


58.38(4) 


2 


spd limit 


24s22pl0d 


59.12(4) 


3 


spdf limit 


24s22pl0d6f 


59.37(4) 


4 


spdfg limit 


24s22pl0d6f2g 


59.37(4) 



The VMC atomization energy of the benzene molecule within the J — DFT — WF for 
fixed Imax is reported in Tab. IIVI The estimated exact atomization energy of the benzene 

lell correlation, that we assume to be 
. Therefore, by optimizing the Jastrow 



molecule is 59.33(3) (eV) [L9j, by neglecting inner-s 
negligible with the pseudopotentials we have used 



factor, one can get almost exact atomization energy as long as a large basis set is considered 
for the DFT Slater determinant. 

One of the main outcome of our work, is that the DFT Slater-determinant is a very good 
input for QMC calculations, provided the basis used is sufficiently large. In fact by full 
optimization of both the Jastrow and the determinantal parts of the WF and by using a 
large basis for the Slater determinant 24s22pl0c?6/, the VMC total energy of the benzene 
molecule is —37.6491 (3) (if). Hence, the difference between the VMC total energy using 
the J - DFT - WF and the J - OPT - WF is very small 0.035(eV) and this shows 
that, by optimizing the determinantal part of the WF, the total energy improves only by 
a small amount and does not appreciably change the atomization energy of benzene, from 
59.37(4) (eV) to 59.41 (3) (eV). Remarkably from a DFT-LDA atomization energy that is 
completely wrong by ~ 10 (eV), we can obtain an almost exact atomization energy using 
the J — DFT — WF wave function with the Kohn-Sham molecular orbitals obtained with 
the energetically poor DFT-LDA method. 

A much different behaviour is obtained when the wave function is fully optimized within 
a small basis set. Indeed, we have applied full optimization on the smallest basis set 6s4p, 
and the VMC total energy using the J - OPT - WF is —37.6384(5) (H), which is 1.12(eV) 
below the corresponding J — DFT — WF energy. This energy gain is more than one order 
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TABLE V: LRDMC total and atomization energies of the benzene molecule obtained by the J — 
DFT - WF and the J - OPT - WF, described in the text. 





J - DFT - WF 


J - OPT - WF 


Total energy (H) 
Binding energy (eV) 


-37.7111(5) 
59.41(5) 


-37.7128(4) 
59.45(4) 



of magnitude larger than the one obtained in the previous case. Therefore we conclude that 
the Kohn-Sham molecular orbitals are very accurate only when a sufficiently large basis is 
used in the DFT calculation. 

In this limit, in order to show the quality of our variational wave functions, we have 
carried out LRDMC [24j calculations using the J - DFT - WF and the J - OPT - WF. 
The LRDMC total and atomization energies of the benzene molecule are shown in Tab. PVT . 
Though the LRDMC improves the VMC total energy of the benzene molecule by about 
~ 1.7(eV), the atomization energy remains unchanged within the statistical errors. 

IV. CONCLUSION 

We have introduced a very simple method to make accurate and systematically converging 
SCF calculations with localized basis sets of increasing dimension. This work shows that 
the use of a large basis set may be extremely important for accurate calculations. It is 
possible that our method could be relevant also for defining more efficient electronic structure 
packages that are free of any limitation about basis set dimension. Preliminary application 



of the method to periodic systems are also extremely encouraging 20], because one can work 
also in this case with a localized basis set, with convergence properties similar to plane 
wave DFT methods. For large extended systems the condition number of a GTO localized 
basis set increases quite rapidly with the system size but we have tested in Silicon with 
a supercell containing up to 256 atoms (i.e. 1024 valence electrons) that our algorithm 
remains stable for fixed choice of the parameter e mac h — 10~ 16 used to remove the singular 
directions. Although it impossible to obtain the error of the finite basis used (8s6p4d per 
atom) because the CBS limit is computationally too expensive for large number of electrons, 
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by comparing our results with standard plane wave methods it turns out that our accuracy 
should remain constant for the energy per atom. This is expected from general grounds, 
since for extended systems the condition number of a localized basis set, defined by a fixed 
number of orbitals per atom, should saturate in the infinite volume limit, when orbitals 
corresponding to atoms that are very far apart remain orthogonal, because they do not 
overlap. As we have already remarked before better accuracy -probably necessary for large 
extended systems when for instance the chemical accuracy in the total energy is required- 
can be in principle possible with much smaller e mac h, that can be used only with a more 
accurate arithmetic (e.g. quadruple precision). 

Moreover we have shown that, for QMC applications, our method is extremely useful, 
because only in the large basis set limit the output molecular orbitals of our new SCF 
calculation define an extremely accurate Slater determinant, that essentially, does not need 
to be optimized. This work also highlights a remarkable property of the DFT method, 
namely that , rather surprisingly, the Kohn-Sham molecular orbitals are rather robust and 
stable in the large basis set limit, and do not seem to be very much sensitive to the accuracy 
of the functional used. In the benzene example for instance, the accuracy in the atomization 
energy with the LDA functional used was very poor with an error of about ~ lOeT^, whereas 
when the same Kohn-Sham orbitals are used for standard QMC calculation, they provide 
almost optimal results, not only compatible with experiments, but also stable against further 
optimization of the energy in presence of the Jastrow factor. 
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